<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc,fixuphtml" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <title>Contextual Keywords</title>
  <style type="text/css">
      code{white-space: pre-wrap;}
      span.smallcaps{font-variant: small-caps;}
      span.underline{text-decoration: underline;}
      div.column{display: inline-block; vertical-align: top; width: 50%;}
  </style>
  <link rel="stylesheet" href="../resources/jdk-default.css" />
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
  <link rel="stylesheet" href="../resources/spec-changes.css" />
</head>
<body>
<header id="title-block-header">
<h1 class="title">Contextual Keywords</h1>
<p class="subtitle">Changes to the Java® Language Specification • Version 17.0.2+8-LTS-86</p>
</header>
<nav id="TOC" title="Table Of Contents">
<ul>
<li><a href="#jls-3">Chapter 3: Lexical Structure</a><ul>
<li><a href="#jls-3.2">3.2 Lexical Translations</a></li>
<li><a href="#jls-3.5">3.5 Input Elements and Tokens</a></li>
<li><a href="#jls-3.8">3.8 Identifiers</a></li>
<li><a href="#jls-3.9">3.9 Keywords</a></li>
<li><a href="#jls-3.12">3.12 Operators</a></li>
</ul></li>
<li><a href="#jls-6">Chapter 6: Names</a><ul>
<li><a href="#jls-6.1">6.1 Declarations</a></li>
<li><a href="#jls-6.5">6.5 Determining the Meaning of a Name</a><ul>
<li><a href="#jls-6.5.7">6.5.7 Meaning of Method Names</a><ul>
<li><a href="#jls-6.5.7.1">6.5.7.1 Simple Method Names</a></li>
</ul></li>
</ul></li>
</ul></li>
</ul>
</nav>
<main><p>This document describes changes to the <a href="https://docs.oracle.com/javase/specs/jls/se16/html">Java Language Specification</a> to clarify the use of context when applying the lexical grammar, particularly in the identification of contextual keywords (formerly described as &quot;restricted identifiers&quot; and &quot;restricted keywords&quot;).</p>
<p>We've experimented with two approaches for these keywords since Java SE 9. The first is to determine whether a character sequence is a keyword by describing where it appears in terms of the syntactic grammar. The second is to treat the keyword as an <em>Identifier</em> token, but later reference it literally, just like a keyword, in the syntactic grammar.</p>
<p>As we introduce new forms of contextual keywords like <code>non-sealed</code>, the second approach no longer works (<code>non-sealed</code> is a sequence of tokens, not a single identifier). So this revision standardizes on the first—contextual keywords are identified based on where they will appear in the syntactic grammar. See <a href="#jls-3.9">3.9</a> for further discussion.</p>
<p>Changes are described with respect to existing sections of the JLS. New text is indicated <strong>like this</strong> and deleted text is indicated <del>like this</del>. Explanation and discussion, as needed, are set aside in grey boxes.</p>
<h2 id="jls-3">Chapter 3: Lexical Structure</h2>
<h3 id="jls-3.2">3.2 Lexical Translations</h3>
<p>A raw Unicode character stream is translated into a sequence of tokens, using the following three lexical translation steps, which are applied in turn:</p>
<ol type="1">
<li><p>A translation of Unicode escapes (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.3">3.3</a>) in the raw stream of Unicode characters to the corresponding Unicode character. A Unicode escape of the form <code>\uxxxx</code>, where <code>xxxx</code> is a hexadecimal value, represents the UTF-16 code unit whose encoding is <code>xxxx</code>. This translation step allows any program to be expressed using only ASCII characters.</p></li>
<li><p>A translation of the Unicode stream resulting from step 1 into a stream of input characters and line terminators (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.4">3.4</a>).</p></li>
<li><p>A translation of the stream of input characters and line terminators resulting from step 2 into a sequence of input elements (<a href="#jls-3.5">3.5</a>) which, after white space (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.6">3.6</a>) and comments (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.7">3.7</a>) are discarded, comprise the tokens (<a href="#jls-3.5">3.5</a>) that are the terminal symbols of the syntactic grammar (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-2.html#jls-2.3">2.3</a>).</p></li>
</ol>
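<blockquote>
<p>As a minimal sketch of step 1, the following class spells part of an identifier with a Unicode escape. The names <code>UnicodeEscapeDemo</code> and <code>value</code> are illustrative and not part of the specification. Because escapes are translated before tokens are formed, the escaped and unescaped spellings should denote the same variable.</p>
<pre><code>class UnicodeEscapeDemo {
    static int value() {
        // The escape below stands for the letter 'a', so this declares abc.
        int \u0061bc = 41;
        // The unescaped spelling refers to the same local variable.
        abc = abc + 1;
        return abc;
    }

    public static void main(String[] args) {
        System.out.println(value());
    }
}</code></pre>
</blockquote>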
<p><del>The</del> <strong>In general, the</strong> longest possible translation is used at each step, even if the result does not ultimately make a correct program while another lexical translation would. <del>There is one exception: if lexical translation occurs in a type context (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-4.html#jls-4.11">4.11</a>) and the input stream has two or more consecutive <code>&gt;</code> characters that are followed by a non-<code>&gt;</code> character, then each <code>&gt;</code> character must be translated to the token for the numerical comparison operator <code>&gt;</code>.</del> <strong>Some exceptions exist, as described in <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.3">3.3</a> and <a href="#jls-3.5">3.5</a>.</strong></p>
<div class="deleted">
<blockquote>
<p>The input characters <code>a--b</code> are tokenized (<a href="#jls-3.5">3.5</a>) as <code>a</code>, <code>--</code>, <code>b</code>, which is not part of any grammatically correct program, even though the tokenization <code>a</code>, <code>-</code>, <code>-</code>, <code>b</code> could be part of a grammatically correct program.</p>
<p>Without the rule for <code>&gt;</code> characters, two consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;String&gt;&gt;</code> would be tokenized as the signed right shift operator <code>&gt;&gt;</code>, while three consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;List&lt;String&gt;&gt;&gt;</code> would be tokenized as the unsigned right shift operator <code>&gt;&gt;&gt;</code>. Worse, the tokenization of four or more consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;List&lt;List&lt;String&gt;&gt;&gt;&gt;</code> would be ambiguous, as various combinations of <code>&gt;</code>, <code>&gt;&gt;</code>, and <code>&gt;&gt;&gt;</code> tokens could represent the <code>&gt;&gt;&gt;&gt;</code> characters.</p>
</blockquote>
</div>
<div class="editorial">
<p>The previous assertion of just one exception was too strong.</p>
<p>In Step 1, the character sequence <code>\\u1234</code> is treated as seven distinct characters, not two (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.3">3.3</a>). This is appropriate, but represents another exception to the &quot;longest possible translation&quot; rule.</p>
<p>In Step 3, contextual keywords may sometimes be hyphenated (though none are currently), and in some contexts the hyphen would be treated as a distinct token.</p>
<p>It's better to leave the treatment of each ambiguity to the section in which it arises.</p>
<p>The discussion about <code>&gt;</code> characters has moved to <a href="#jls-3.12">3.12</a>.</p>
</div>
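<blockquote>
<p>The Step 1 exception can be observed from within a program. In this sketch (the class and method names are illustrative), the first literal contains a Unicode escape, while in the second the backslash before <code>u</code> is itself preceded by a backslash and so should not begin an escape.</p>
<pre><code>class BackslashEscapeDemo {
    // One character: the escape is translated to the letter A in step 1.
    static int escapedLength() {
        return "\u0041".length();
    }

    // Six characters: the doubled backslash is a string escape for one
    // backslash, and the remaining five characters are taken literally.
    static int literalLength() {
        return "\\u0041".length();
    }

    public static void main(String[] args) {
        System.out.println(escapedLength());
        System.out.println(literalLength());
    }
}</code></pre>
<p>The seven source characters of the second literal yield a six-character string because the two backslashes are later read as the escape sequence for a single backslash.</p>
</blockquote>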
<h3 id="jls-3.5">3.5 Input Elements and Tokens</h3>
<p>The input characters and line terminators that result from Unicode escape processing (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.3">3.3</a>) and then input line recognition (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.4">3.4</a>) are reduced to a sequence of <em>input elements</em>.</p>
<dl>
<dt><em>Input:</em></dt>
<dd>{<em>InputElement</em>} [<em>Sub</em>]
</dd>
<dt><em>InputElement:</em></dt>
<dd><em>WhiteSpace</em>
</dd>
<dd><em>Comment</em>
</dd>
<dd><em>Token</em>
</dd>
<dt><em>Token:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>Keyword</em>
</dd>
<dd><em>Literal</em>
</dd>
<dd><em>Separator</em>
</dd>
<dd><em>Operator</em>
</dd>
<dt><em>Sub:</em></dt>
<dd><em>the ASCII SUB character, also known as &quot;control-Z&quot;</em>
</dd>
</dl>
<p>Those input elements that are not white space or comments are <em>tokens</em>. The tokens are the terminal symbols of the syntactic grammar (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-2.html#jls-2.3">2.3</a>).</p>
<div class="inserted">
<p>The <em>Input</em> production is ambiguous, meaning that, for some sequences of input characters and line terminators, there is more than one way to match the <em>Input</em> production to the sequence. Ambiguities are resolved as follows:</p>
<ul>
<li><p>Depending on context, as described in <a href="#jls-3.9">3.9</a>, a number of specific character sequences are sometimes interpreted as <em>contextual keywords</em>, and sometimes treated as other non-keyword tokens.</p></li>
<li><p>Depending on context, as described in <a href="#jls-3.12">3.12</a>, the input character <code>&gt;</code> is sometimes interpreted as an operator, and sometimes combined with adjacent characters (as in <code>&gt;&gt;</code>) to form a different operator.</p></li>
<li><p>In every other circumstance, the longest matching token is always preferred, even if the result does not ultimately make a correct program while another lexical translation would.</p></li>
</ul>
</div>
<p>White space (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.6">3.6</a>) and comments (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.7">3.7</a>) can serve to separate tokens that, if adjacent, might be tokenized in another manner. <del>For example, the ASCII characters <code>-</code> and <code>=</code> in the input can form the operator token <code>-=</code> (<a href="#jls-3.12">3.12</a>) only if there is no intervening white space or comment.</del></p>
<div class="inserted">
<blockquote>
<p>So the input characters <code>staticvoid</code> are interpreted as a single identifier token, while the input characters <code>static void</code> (with an ASCII SP character in between <code>c</code> and <code>v</code>) are interpreted as a pair of keyword tokens, <code>static</code> and <code>void</code>, separated by white space.</p>
<p>Similarly, the input characters <code>a--b</code> are interpreted as <code>a</code>, <code>--</code>, and <code>b</code>, which is not part of any grammatically correct program, even though the interpretation <code>a</code>, <code>-</code>, <code>-</code>, and <code>b</code> could be part of a grammatically correct program. The input characters <code>a- -b</code> (with an ASCII SP character in between the two <code>-</code> characters), on the other hand, are interpreted as <code>a</code>, <code>-</code>, <code>-</code>, and <code>b</code>.</p>
</blockquote>
</div>
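<blockquote>
<p>The longest-match rule and the separating role of white space can be sketched as follows (the class and method names are illustrative). The characters <code>a+++b</code> should be tokenized as <code>a</code>, <code>++</code>, <code>+</code>, <code>b</code>, while a space keeps two <code>-</code> characters from combining.</p>
<pre><code>class LongestMatchDemo {
    // Tokenized as a ++ + b, that is, (a++) + b, not a + (++b).
    static int plusPlusPlus(int a, int b) {
        return a+++b;
    }

    // The space prevents the two '-' characters from forming "--",
    // so this is a binary minus applied to a unary minus.
    static int minusMinus(int a, int b) {
        return a- -b;
    }

    public static void main(String[] args) {
        System.out.println(plusPlusPlus(5, 3));
        System.out.println(minusMinus(5, 3));
    }
}</code></pre>
<p>With the arguments 5 and 3, the alternative reading <code>a + (++b)</code> would give 9 rather than 8, so the result of <code>plusPlusPlus</code> shows which tokenization was chosen.</p>
</blockquote>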
<p>As a special concession for compatibility with certain operating systems, the ASCII SUB character (<code>\u001a</code>, or control-Z) is ignored if it is the last character in the escaped input stream.</p>
<p>Consider two tokens <code>x</code> and <code>y</code> in the resulting input stream. If <code>x</code> precedes <code>y</code>, then we say that <code>x</code> is <em>to the left of</em> <code>y</code> and that <code>y</code> is <em>to the right of</em> <code>x</code>.</p>
<blockquote>
<p>For example, in this simple piece of code:</p>
<pre><code>class Empty {
}</code></pre>
<p>we say that the <code>}</code> token is to the right of the <code>{</code> token, even though it appears, in this two-dimensional representation, downward and to the left of the <code>{</code> token. This convention about the use of the words left and right allows us to speak, for example, of the right-hand operand of a binary operator or of the left-hand side of an assignment.</p>
</blockquote>
<h3 id="jls-3.8">3.8 Identifiers</h3>
<p>An <em>identifier</em> is an unlimited-length sequence of <em>Java letters</em> and <em>Java digits</em>, the first of which must be a <em>Java letter</em>.</p>
<dl>
<dt><em>Identifier:</em></dt>
<dd><em>IdentifierChars</em> <em>but not a</em> <em>Keyword</em> <em>or</em> <em>BooleanLiteral</em> <em>or</em> <em>NullLiteral</em>
</dd>
<dt><em>IdentifierChars:</em></dt>
<dd><em>JavaLetter</em> {<em>JavaLetterOrDigit</em>}
</dd>
<dt><em>JavaLetter:</em></dt>
<dd><em>any Unicode character that is a &quot;Java letter&quot;</em>
</dd>
<dt><em>JavaLetterOrDigit:</em></dt>
<dd><em>any Unicode character that is a &quot;Java letter-or-digit&quot;</em>
</dd>
</dl>
<p>A &quot;Java letter&quot; is a character for which the method <code>Character.isJavaIdentifierStart(int)</code> returns true.</p>
<p>A &quot;Java letter-or-digit&quot; is a character for which the method <code>Character.isJavaIdentifierPart(int)</code> returns true.</p>
<blockquote>
<p>The &quot;Java letters&quot; include uppercase and lowercase ASCII Latin letters <code>A-Z</code> (<code>\u0041-\u005a</code>), and <code>a-z</code> (<code>\u0061-\u007a</code>), and, for historical reasons, the ASCII dollar sign (<code>$</code>, or <code>\u0024</code>) and underscore (<code>_</code>, or <code>\u005f</code>). The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems. The underscore may be used in identifiers formed of two or more characters, but it cannot be used as a one-character identifier due to being a keyword.</p>
</blockquote>
<blockquote>
<p>The &quot;Java digits&quot; include the ASCII digits <code>0-9</code> (<code>\u0030-\u0039</code>).</p>
</blockquote>
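<blockquote>
<p>A minimal sketch of the <em>IdentifierChars</em> production in terms of the two <code>Character</code> methods named above. The class <code>IdentifierChars</code> and its <code>matches</code> method are hypothetical helpers. The check deliberately does not exclude keywords, boolean literals, or the null literal, so it does not decide whether a sequence is an <em>Identifier</em>.</p>
<pre><code>class IdentifierChars {
    // Hypothetical helper: tests the IdentifierChars production only.
    static boolean matches(String s) {
        int[] codePoints = s.codePoints().toArray();
        if (codePoints.length == 0
                || !Character.isJavaIdentifierStart(codePoints[0])) {
            return false;
        }
        for (int cp : codePoints) {
            // Every Java letter is also a Java letter-or-digit, so
            // rechecking the first code point here is harmless.
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("MAX_VALUE"));
        System.out.println(matches("3i"));
    }
}</code></pre>
<p>Working with code points rather than <code>char</code> values keeps the check meaningful for supplementary characters, which is why the <code>int</code> overloads are used.</p>
</blockquote>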
<p>Letters and digits may be drawn from the entire Unicode character set, which supports most writing scripts in use in the world today, including the large sets for Chinese, Japanese, and Korean. This allows programmers to use identifiers in their programs that are written in their native languages.</p>
<p><del>An identifier cannot have the same spelling (Unicode character sequence) as a keyword (<a href="#jls-3.9">3.9</a>), boolean literal (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.3">3.10.3</a>), or the null literal (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.8">3.10.8</a>), or a compile-time error occurs.</del></p>
<p><strong>A sequence of input characters does not represent an identifier if (in a particular context) it represents a keyword (<a href="#jls-3.9">3.9</a>), a boolean literal (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.3">3.10.3</a>), or the null literal (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.8">3.10.8</a>).</strong></p>
<div class="editorial">
<p>Two problems with the former phrasing:</p>
<ul>
<li><p>&quot;Have the same spelling&quot; is unclear about whether it takes context into account. Looking purely at their characters, certain identifiers can, in fact, have the same spelling as certain keywords.</p></li>
<li><p>A programmer accidentally using a keyword where they intended to use an identifier won't necessarily get a compile-time error—it's also possible that the program is just interpreted differently than expected. (E.g., in a class body, <code>Public Foo() { ... }</code> is a method declaration, while <code>public Foo() { ... }</code> is a constructor declaration.)</p></li>
</ul>
</div>
<p>Two identifiers are the same only if, after ignoring characters that are ignorable, the identifiers have the same Unicode character for each letter or digit. An ignorable character is a character for which the method <code>Character.isIdentifierIgnorable(int)</code> returns true. Identifiers that have the same external appearance may yet be different.</p>
<blockquote>
<p>For example, the identifiers consisting of the single letters LATIN CAPITAL LETTER A (<code>A</code>, <code>\u0041</code>), LATIN SMALL LETTER A (<code>a</code>, <code>\u0061</code>), GREEK CAPITAL LETTER ALPHA (<code>Α</code>, <code>\u0391</code>), CYRILLIC SMALL LETTER A (<code>а</code>, <code>\u0430</code>) and MATHEMATICAL BOLD ITALIC SMALL A (<code>𝒂</code>, <code>\ud835\udc82</code>) are all different.</p>
<p>Unicode composite characters are different from their canonical equivalent decomposed characters. For example, a LATIN CAPITAL LETTER A ACUTE (<code>Á</code>, <code>\u00c1</code>) is different from a LATIN CAPITAL LETTER A (<code>A</code>, <code>\u0041</code>) immediately followed by a NON-SPACING ACUTE (<code>´</code>, <code>\u0301</code>) in identifiers. See The Unicode Standard, Section 3.11 &quot;Normalization Forms&quot;.</p>
</blockquote>
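<blockquote>
<p>The distinctions above can be checked with ordinary string comparison. This sketch uses hypothetical class and method names, and compares strings rather than identifiers, on the assumption that both are compared code point by code point without normalization.</p>
<pre><code>import java.text.Normalizer;

class IdentifierSameness {
    // Latin capital A and Greek capital alpha look alike but differ.
    static boolean latinEqualsGreek() {
        return "\u0041".equals("\u0391");
    }

    // A precomposed A-acute versus A followed by a combining acute accent.
    static boolean composedEqualsDecomposed() {
        return "\u00c1".equals("A\u0301");
    }

    // Normalizing to NFC composes the pair, but the language itself
    // performs no such normalization when comparing identifiers.
    static boolean equalAfterNfc() {
        return "\u00c1".equals(
                Normalizer.normalize("A\u0301", Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        System.out.println(latinEqualsGreek());
        System.out.println(composedEqualsDecomposed());
        System.out.println(equalAfterNfc());
    }
}</code></pre>
</blockquote>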
<blockquote>
<p>Examples of identifiers are:</p>
<ul>
<li><code>String</code></li>
<li><code>i3</code></li>
<li><code>αρετη</code></li>
<li><code>MAX_VALUE</code></li>
<li><code>isLetterOrDigit</code></li>
</ul>
</blockquote>
<p><del>The identifiers <code>var</code> and <code>yield</code> are <em>restricted identifiers</em> because they are not allowed in some contexts.</del></p>
<div class="editorial">
<p>In this revised approach, <code>var</code> and <code>yield</code> are <em>contextual keywords</em>. We keep the same restrictions on uses of <code>var</code> and <code>yield</code>, but the term &quot;restricted identifier&quot; no longer works—in contexts where they are &quot;restricted&quot;, they may not be identifiers at all.</p>
</div>
<p><strong>In some contexts, to facilitate the recognition of contextual keywords (<a href="#jls-3.9">3.9</a>), the syntactic grammar disallows certain identifiers by defining a production in terms of a subset of identifiers. These subsets are defined as follows:</strong></p>
<p><del>A <em>type identifier</em> is an identifier that is not the character sequence <code>var</code> or the character sequence <code>yield</code>.</del></p>
<dl>
<dt><em>TypeIdentifier:</em></dt>
<dd><em>Identifier</em> <em>but not</em> <code>var</code><strong>,</strong> <del><em>or</em></del> <code>yield</code><strong>, or <code>record</code></strong>
</dd>
</dl>
<blockquote>
<p><del>Type identifiers are</del> <strong>A <em>TypeIdentifier</em> is</strong> used in certain contexts involving the declaration or use of types. For example, the name of a class must be a <em>TypeIdentifier</em>, so it is illegal to declare a class named <code>var</code><strong>,</strong> <del>or</del> <code>yield</code><strong>, or <code>record</code></strong> (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-8.html#jls-8.1">8.1</a>).</p>
</blockquote>
<p><del>An <em>unqualified method identifier</em> is an identifier that is not the character sequence <code>yield</code>.</del></p>
<dl>
<dt><em>UnqualifiedMethodIdentifier:</em></dt>
<dd><em>Identifier</em> <em>but not</em> <code>yield</code>
</dd>
</dl>
<blockquote>
<p><del>This restriction allows <code>yield</code> to be used in a <code>yield</code> statement (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.21">14.21</a>) and still also be used as a (qualified) method name for compatibility reasons.</del></p>
</blockquote>
<blockquote>
<p><strong>An <em>UnqualifiedMethodIdentifier</em> is used when referencing a method with a single identifier. An invocation of a method named <code>yield</code> must be qualified, to distinguish the invocation from a <code>yield</code> statement.</strong></p>
</blockquote>
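<blockquote>
<p>A short sketch of how these subsets play out in ordinary code, assuming a compiler for Java SE 14 or later. The class <code>RestrictedNames</code> and its members are illustrative. The names excluded from <em>TypeIdentifier</em> remain usable as variable names, and a method named <code>yield</code> can be declared but is invoked here only in qualified form.</p>
<pre><code>class RestrictedNames {
    // A method may still be declared with the name yield.
    static int yield() {
        return 7;
    }

    static int demo(int k) {
        // First var: the contextual keyword (a LocalVariableType).
        // Second var: an ordinary identifier naming the variable.
        var var = 1;
        // record is excluded from TypeIdentifier, not from Identifier.
        int record = 2;
        int result = switch (k) {
            case 0:
                yield var + record;             // yield statement
            default:
                yield RestrictedNames.yield();  // qualified invocation
        };
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo(0));
        System.out.println(demo(1));
    }
}</code></pre>
</blockquote>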
<div class="editorial">
<p>The formal term &quot;type identifier&quot; is only used once outside of this section (<a href="#jls-6.1">6.1</a>), and &quot;unqualified method identifier&quot; is never used. It's simpler just to stick with the grammar production names.</p>
<p>The revised discussion about <em>UnqualifiedMethodIdentifier</em> mimics the <em>TypeIdentifier</em> discussion, describing where the restriction applies rather than getting into the design motivation. (Design motivation comes from the new introductory sentence, above.)</p>
</div>
<h3 id="jls-3.9">3.9 Keywords</h3>
<p>51 character sequences, formed from ASCII <del>letters</del> <strong>characters</strong>, are reserved for use as keywords and cannot be used as identifiers (<a href="#jls-3.8">3.8</a>). <strong>Another 12 character sequences, also formed from ASCII characters, <em>may</em> be interpreted as keywords, depending on the context in which they appear.</strong></p>
<div class="editorial">
<p>Note that <code>_</code> is not an ASCII letter. Neither is <code>-</code>, which is expected to appear in some contextual keywords like <code>non-sealed</code>.</p>
</div>
<dl>
<dt><em>Keyword:</em></dt>
<dd><strong><em>ReservedKeyword</em></strong>
</dd>
<dd><strong><em>ContextualKeyword</em></strong>
</dd>
<dt><strong><em>ReservedKeyword:</em></strong></dt>
<dd>(one of)
</dd>
<dd><code>abstract continue for new switch</code><br />
<code>assert default if package synchronized</code><br />
<code>boolean do goto private this</code><br />
<code>break double implements protected throw</code><br />
<code>byte else import public throws</code><br />
<code>case enum instanceof return transient</code><br />
<code>catch extends int short try</code><br />
<code>char final interface static void</code><br />
<code>class finally long strictfp volatile</code><br />
<code>const float native super while</code><br />
<code>_</code> (underscore)
</dd>
</dl>
<div class="inserted">
<dl>
<dt><em>ContextualKeyword:</em></dt>
<dd>(one of)
</dd>
<dd><code>exports opens to var</code><br />
<code>module provides transitive with</code><br />
<code>open requires uses yield</code>
</dd>
</dl>
</div>
<blockquote>
<p>The keywords <code>const</code> and <code>goto</code> are reserved, even though they are not currently used. This may allow a Java compiler to produce better error messages if these C++ keywords incorrectly appear in programs. The keyword <code>_</code> (underscore) is reserved for possible future use in parameter declarations.</p>
</blockquote>
<div class="inserted">
<p>A character sequence matching a contextual keyword is not treated as a keyword if any part of the sequence can be combined with the immediately preceding or following characters to form a different token.</p>
<blockquote>
<p>So the character sequence <code>openmodule</code> is interpreted as a single identifier rather than two contextual keywords, even at the start of a <em>ModuleDeclaration</em>. If two keywords are intended, they must be separated by white space or a comment.</p>
</blockquote>
<p>Any other character sequence matching a contextual keyword is treated as a keyword if and only if it appears in one of the following contexts of the syntactic grammar:</p>
<ul>
<li><p>For <code>open</code> and <code>module</code>, when appearing as specified as a terminal in a <em>ModuleDeclaration</em> (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-7.html#jls-7.7">7.7</a>)</p></li>
<li><p>For <code>requires</code>, <code>exports</code>, <code>opens</code>, <code>uses</code>, <code>provides</code>, <code>to</code>, and <code>with</code>, when appearing as specified as a terminal in a <em>ModuleDirective</em></p></li>
<li><p>For <code>transitive</code>, when appearing as specified as a terminal in a <em>RequiresModifier</em></p>
<blockquote>
<p>The directive <code>requires transitive;</code> does not make use of <em>RequiresModifier</em>, and so in this case <code>transitive</code> is interpreted as an identifier.</p>
</blockquote></li>
<li><p>For <code>var</code>, when appearing as specified as a terminal in a <em>LocalVariableType</em> (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.4">14.4</a>) or a <em>LambdaParameterType</em> (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.27.1">15.27.1</a>)</p>
<blockquote>
<p>In many other contexts, attempting to use the character sequence <code>var</code> as an identifier will cause an error, because <code>var</code> is not a valid <em>TypeIdentifier</em> (<a href="#jls-3.8">3.8</a>).</p>
</blockquote></li>
<li><p>For <code>yield</code>, when appearing as specified as a terminal in a <em>YieldStatement</em> (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.21">14.21</a>)</p>
<blockquote>
<p>In many other contexts, attempting to use the character sequence <code>yield</code> as an identifier will cause an error, because <code>yield</code> is neither a valid <em>TypeIdentifier</em> nor a valid <em>UnqualifiedMethodIdentifier</em>.</p>
</blockquote></li>
</ul>
<blockquote>
<p>While these rules depend on details of the syntactic grammar, a compiler for the Java Programming Language can implement them without fully parsing the input program. For example, a heuristic could be used to track the contextual state of the tokenizer, as long as the heuristic guarantees that valid uses of contextual keywords are tokenized as keywords, and valid uses of identifiers are tokenized as identifiers. Or the compiler could always tokenize a contextual keyword as an identifier, leaving it to later stages to recognize special uses of these identifiers.</p>
</blockquote>
</div>
<div class="editorial">
<p>We're being intentionally vague about how an implementation might disambiguate. The above note should be enough to make clear that this is an implementation choice, not something we're going to specify explicitly. Still, as the language evolves, designers will need to be careful about the implementation impact of new contextual keywords in certain contexts.</p>
</div>
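<blockquote>
<p>One possible shape for such a heuristic, shown only as a toy sketch and not as a description of any real compiler, is a one-token lookahead for <code>transitive</code> inside a <code>requires</code> directive. The class name, the method name, and the pre-split token array are all assumptions made for illustration. The sketch also assumes the directive is well formed and ends with <code>;</code>.</p>
<pre><code>class TransitiveHeuristic {
    // Hypothetical helper. Given the tokens of one requires directive and
    // the index of a token spelled "transitive", guess whether it is a
    // keyword. A module name is either continued by "." or ended by ";",
    // so a "transitive" followed by one of those is read as part of the name.
    static boolean isKeyword(String[] tokens, int index) {
        String next = tokens[index + 1];
        return !(next.equals(";") || next.equals("."));
    }

    public static void main(String[] args) {
        String[] plain = {"requires", "transitive", ";"};
        String[] modified = {"requires", "transitive", "java", ".", "sql", ";"};
        System.out.println(isKeyword(plain, 1));
        System.out.println(isKeyword(modified, 1));
    }
}</code></pre>
<p>The lookahead mirrors the grammar-based rule: where no <em>RequiresModifier</em> can be matched, <code>transitive</code> is left as an identifier.</p>
</blockquote>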
<blockquote>
<p>A variety of character sequences are sometimes assumed, incorrectly, to be keywords:</p>
</blockquote>
<blockquote>
<ul>
<li><p><code>true</code> and <code>false</code> are not keywords, but rather boolean literals (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.3">3.10.3</a>).</p></li>
<li><p><code>null</code> is not a keyword, but rather the null literal (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-3.html#jls-3.10.8">3.10.8</a>).</p></li>
<li><p><del><code>var</code> and <code>yield</code> are not keywords, but rather restricted identifiers (<a href="#jls-3.8">3.8</a>). <code>var</code> has special meaning as the type of a local variable declaration (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.4">14.4</a>, <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.14.1">14.14.1</a>, <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.14.2">14.14.2</a>, <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.20.3">14.20.3</a>) and the type of a lambda formal parameter (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.27.1">15.27.1</a>). <code>yield</code> has special meaning in a <code>yield</code> statement (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-14.html#jls-14.21">14.21</a>). All invocations of a method named <code>yield</code> must be qualified so as to be distinguished from a <code>yield</code> statement.</del></p></li>
</ul>
</blockquote>
<div class="deleted">
<p>A further ten character sequences are <em>restricted keywords</em>: <code>open</code>, <code>module</code>, <code>requires</code>, <code>transitive</code>, <code>exports</code>, <code>opens</code>, <code>to</code>, <code>uses</code>, <code>provides</code>, and <code>with</code>. These character sequences are tokenized as keywords solely where they appear as terminals in the <em>ModuleDeclaration</em>, <em>ModuleDirective</em>, and <em>RequiresModifier</em> productions (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-7.html#jls-7.7">7.7</a>). They are tokenized as identifiers everywhere else, for compatibility with programs written before the introduction of restricted keywords. There is one exception: immediately to the right of the character sequence <code>requires</code> in the <em>ModuleDirective</em> production, the character sequence <code>transitive</code> is tokenized as a keyword unless it is followed by a separator, in which case it is tokenized as an identifier.</p>
</div>
<div class="editorial">
<p>The deleted rules are subsumed by the new rules, above.</p>
<p>The rephrasing &quot;when appearing as specified&quot; avoids the need for a special exception for <code>transitive</code>: as noted, if the grammar isn't looking for a <em>RequiresModifier</em>, then <code>transitive</code> isn't in an appropriate context to be treated as a keyword.</p>
</div>
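<blockquote>
<p>Outside the module-related productions, the ten module-related character sequences should behave as ordinary identifiers. The following sketch, with an illustrative class name, declares a local variable for each of them.</p>
<pre><code>class ContextualAsIdentifiers {
    static int sum() {
        // None of these positions is a ModuleDeclaration, ModuleDirective,
        // or RequiresModifier, so each name below is an Identifier token.
        int open = 1, module = 2, requires = 3, transitive = 4;
        int exports = 5, opens = 6, to = 7, uses = 8, provides = 9, with = 10;
        return open + module + requires + transitive
                + exports + opens + to + uses + provides + with;
    }

    public static void main(String[] args) {
        System.out.println(sum());
    }
}</code></pre>
</blockquote>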
<h3 id="jls-3.12">3.12 Operators</h3>
<p>38 tokens, formed from ASCII characters, are the <em>operators</em>.</p>
<dl>
<dt><em>Operator:</em></dt>
<dd>(one of)
</dd>
<dd><code>= &gt; &lt; ! ~ ? : -&gt;</code><br />
<code>== &gt;= &lt;= != &amp;&amp; || ++ --</code><br />
<code>+ - * / &amp; | ^ % &lt;&lt; &gt;&gt; &gt;&gt;&gt;</code><br />
<code>+= -= *= /= &amp;= |= ^= %= &lt;&lt;= &gt;&gt;= &gt;&gt;&gt;=</code>
</dd>
</dl>
<div class="inserted">
<p>When the character <code>&gt;</code> appears in a type context (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-4.html#jls-4.11">4.11</a>)—that is, as part of a <em>Type</em> or an <em>UnannType</em> in the syntactic grammar (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-4.html#jls-4.1">4.1</a>, <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-8.html#jls-8.3">8.3</a>)—it is always treated as the <code>&gt;</code> operator, even when it could be combined with an adjacent <code>&gt;</code> character to form a different operator.</p>
<blockquote>
<p>So the character sequence <code>List&lt;List&lt;String&gt;&gt;</code>, when appearing as a <em>Type</em>, ends with two <code>&gt;</code> operators.</p>
<p>Without this rule for <code>&gt;</code> characters, two consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;String&gt;&gt;</code> would be tokenized (<a href="#jls-3.5">3.5</a>) as the signed right shift operator <code>&gt;&gt;</code>, while three consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;List&lt;String&gt;&gt;&gt;</code> would be tokenized as the unsigned right shift operator <code>&gt;&gt;&gt;</code>. Worse, the tokenization of four or more consecutive <code>&gt;</code> brackets in a type such as <code>List&lt;List&lt;List&lt;List&lt;String&gt;&gt;&gt;&gt;</code> would be ambiguous, as various combinations of <code>&gt;</code>, <code>&gt;&gt;</code>, and <code>&gt;&gt;&gt;</code> tokens could represent the <code>&gt;&gt;&gt;&gt;</code> characters.</p>
</blockquote>
</div>
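<blockquote>
<p>A brief sketch of the two readings of consecutive <code>&gt;</code> characters, using an illustrative class name. In the declaration the characters close nested type argument lists, while in the expressions they form shift operators.</p>
<pre><code>import java.util.ArrayList;
import java.util.List;

class AngleBrackets {
    // In a type context, each closing bracket is its own &gt; token.
    static int nestedSize() {
        List&lt;List&lt;List&lt;String&gt;&gt;&gt; nested = new ArrayList&lt;&gt;();
        nested.add(new ArrayList&lt;&gt;());
        return nested.size();
    }

    // In an expression, the same characters form shift operators.
    static int shifts() {
        int signed = 64 &gt;&gt; 2;
        int unsigned = -1 &gt;&gt;&gt; 28;
        return signed + unsigned;
    }

    public static void main(String[] args) {
        System.out.println(nestedSize());
        System.out.println(shifts());
    }
}</code></pre>
</blockquote>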
<h2 id="jls-6">Chapter 6: Names</h2>
<h3 id="jls-6.1">6.1 Declarations</h3>
<p>A <em>declaration</em> introduces an entity into a program and includes an identifier (<a href="#jls-3.8">3.8</a>) that can be used in a name to refer to this entity. The identifier is constrained to be a <del>type identifier</del> <strong><em>TypeIdentifier</em></strong> when the entity being introduced is a class, interface, or type parameter.</p>
<p>...</p>
<h3 id="jls-6.5">6.5 Determining the Meaning of a Name</h3>
<p>The meaning of a name depends on the context in which it is used. The determination of the meaning of a name requires three steps:</p>
<ul>
<li><p>First, context causes a name syntactically to fall into one of seven categories: <em>ModuleName</em>, <em>PackageName</em>, <em>TypeName</em>, <em>ExpressionName</em>, <em>MethodName</em>, <em>PackageOrTypeName</em>, or <em>AmbiguousName</em>.</p>
<p><em>TypeName</em> and <em>MethodName</em> are less expressive than the other five categories, because they are denoted with <em>TypeIdentifier</em> and <em>UnqualifiedMethodIdentifier</em> <strong>(<a href="#jls-3.8">3.8</a>)</strong>, respectively. <del>The former excludes the character sequences <code>var</code> and <code>yield</code> (<a href="#jls-3.8">3.8</a>), and the latter excludes the character sequence <code>yield</code>.</del></p>
<div class="editorial">
<p>This sentence is bound to fall out of sync as new contextual keywords are added, and it merely repeats the discussion in 3.8. A cross-reference is sufficient.</p>
</div></li>
<li><p>Second, a name that is initially classified by its context as an <em>AmbiguousName</em> or as a <em>PackageOrTypeName</em> is then reclassified to be a <em>PackageName</em>, <em>TypeName</em>, or <em>ExpressionName</em>.</p></li>
<li><p>Third, the resulting category then dictates the final determination of the meaning of the name (or a compile-time error if the name has no meaning).</p></li>
</ul>
<dl>
<dt><em>ModuleName:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>ModuleName</em> <code>.</code> <em>Identifier</em>
</dd>
<dt><em>PackageName:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>PackageName</em> <code>.</code> <em>Identifier</em>
</dd>
<dt><em>TypeName:</em></dt>
<dd><em>TypeIdentifier</em>
</dd>
<dd><em>PackageOrTypeName</em> <code>.</code> <em>TypeIdentifier</em>
</dd>
<dt><em>PackageOrTypeName:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>PackageOrTypeName</em> <code>.</code> <em>Identifier</em>
</dd>
<dt><em>ExpressionName:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>AmbiguousName</em> <code>.</code> <em>Identifier</em>
</dd>
<dt><em>MethodName:</em></dt>
<dd><em>UnqualifiedMethodIdentifier</em>
</dd>
<dt><em>AmbiguousName:</em></dt>
<dd><em>Identifier</em>
</dd>
<dd><em>AmbiguousName</em> <code>.</code> <em>Identifier</em>
</dd>
</dl>
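<p>The second step, reclassification, can be illustrated with a small sketch. It assumes the reclassification rules of JLS 6.5.2, which are not reproduced in this document, and all class, field, and method names are hypothetical. In both methods the name to the left of the dot starts out as an <em>AmbiguousName</em>; it becomes a <em>TypeName</em> when only a type of that name is in scope, and an <em>ExpressionName</em> when a local variable of that name is in scope.</p>

```java
// Hypothetical demo class; the names are illustrative only.
class Reclassify {
    static class Holder {
        static String label = "type";
    }

    static class Box {
        String label = "variable";
    }

    // "Holder" is reclassified as a TypeName: no variable named Holder is in scope.
    static String viaType() {
        return Holder.label;
    }

    // "Holder" is reclassified as an ExpressionName: the local variable
    // takes precedence over the type of the same name.
    static String viaVariable() {
        Box Holder = new Box();
        return Holder.label;
    }
}
```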
<blockquote>
<p>The use of context helps to minimize name conflicts between entities of different kinds. Such conflicts will be rare if the naming conventions described in <a href="#jls-6.1">6.1</a> are followed. Nevertheless, conflicts may arise unintentionally as types developed by different programmers or different organizations evolve. For example, types, methods, and fields may have the same name. It is always possible to distinguish between a method and a field with the same name, since the context of a use always tells whether a method is intended.</p>
</blockquote>
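<p>A minimal sketch of a field and a method that share a name is shown here; the class and member names are hypothetical. The parenthesized argument list is what places <code>count</code> in a method invocation context, so the two uses are never confused.</p>

```java
// Hypothetical demo class; the names are illustrative only.
class Counter {
    int count = 3;

    // Same simple name as the field; inside the body, "count" without
    // parentheses is an expression name and refers to the field.
    int count() {
        return count * 2;
    }

    static int fieldUse() {
        Counter c = new Counter();
        return c.count;      // field access
    }

    static int methodUse() {
        Counter c = new Counter();
        return c.count();    // method invocation
    }
}
```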
<h4 id="jls-6.5.7">6.5.7 Meaning of Method Names</h4>
<h5 id="jls-6.5.7.1">6.5.7.1 Simple Method Names</h5>
<p>A simple method name appears in the context of a method invocation expression (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.12">15.12</a>). The simple method name consists of a single <del><em>Identifier</em></del> <strong><em>UnqualifiedMethodIdentifier</em></strong> which specifies the name of the method to be invoked. The rules of method invocation require that the <del><em>Identifier</em></del> <strong><em>UnqualifiedMethodIdentifier</em></strong> either denotes a method that is in scope at the point of the method invocation, or denotes a method imported by a single-static-import declaration or static-import-on-demand declaration (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-7.html#jls-7.5.3">7.5.3</a>, <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-7.html#jls-7.5.4">7.5.4</a>).</p>
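<p>One consequence of using <em>UnqualifiedMethodIdentifier</em> can be sketched as follows, assuming the definition in 3.8 that excludes <code>yield</code>. The class and method names are hypothetical. A method may still be declared with the name <code>yield</code>, but it can only be invoked through a qualified form.</p>

```java
// Hypothetical demo class; the names are illustrative only.
class YieldDemo {
    // A method declaration takes an Identifier, so "yield" is permitted here.
    static int yield() {
        return 7;
    }

    static int call() {
        // An unqualified invocation needs an UnqualifiedMethodIdentifier,
        // so a compiler is expected to reject the following line:
        // return yield();

        // Qualifying the invocation avoids the restriction.
        return YieldDemo.yield();
    }
}
```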
<div class="example">
<p>Example 6.5.7.1-1. Simple Method Names</p>
<p>The following program demonstrates the role of scoping when determining which method to invoke.</p>
<pre><code>class Super {
    void f2(String s)       {}
    void f3(String s)       {}
    void f3(int i1, int i2) {}
}

class Test {
    void f1(int i) {}
    void f2(int i) {}
    void f3(int i) {}

    void m() {
        new Super() {
            {
                f1(0);  // OK, resolves to Test.f1(int)
                f2(0);  // compile-time error
                f3(0);  // compile-time error
            }
        };
    }
}</code></pre>
<p>For the invocation <code>f1(0)</code>, only one method named <code>f1</code> is in scope. It is the method <code>Test.f1(int)</code>, whose declaration is in scope throughout the body of <code>Test</code> including the anonymous class declaration. <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.12.1">15.12.1</a> chooses to search in class <code>Test</code> since the anonymous class declaration has no member named <code>f1</code>. Eventually, <code>Test.f1(int)</code> is resolved.</p>
<p>For the invocation <code>f2(0)</code>, two methods named <code>f2</code> are in scope. First, the declaration of the method <code>Super.f2(String)</code> is in scope throughout the anonymous class declaration. Second, the declaration of the method <code>Test.f2(int)</code> is in scope throughout the body of <code>Test</code> including the anonymous class declaration. (Note that neither declaration shadows the other, because at the point where each is declared, the other is not in scope.) <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.12.1">15.12.1</a> chooses to search in class <code>Super</code> because it has a member named <code>f2</code>. However, <code>Super.f2(String)</code> is not applicable to <code>f2(0)</code>, so a compile-time error occurs. Note that class <code>Test</code> is not searched.</p>
<p>For the invocation <code>f3(0)</code>, three methods named <code>f3</code> are in scope. First and second, the declarations of the methods <code>Super.f3(String)</code> and <code>Super.f3(int,int)</code> are in scope throughout the anonymous class declaration. Third, the declaration of the method <code>Test.f3(int)</code> is in scope throughout the body of <code>Test</code> including the anonymous class declaration. <a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.12.1">15.12.1</a> chooses to search in class <code>Super</code> because it has a member named <code>f3</code>. However, <code>Super.f3(String)</code> and <code>Super.f3(int,int)</code> are not applicable to <code>f3(0)</code>, so a compile-time error occurs. Note that class <code>Test</code> is not searched.</p>
<p>Choosing to search a nested class's superclass hierarchy before the lexically enclosing scope is called the &quot;comb rule&quot; (<a href="https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.12.1">15.12.1</a>).</p>
</div>
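<p>When the comb rule selects the wrong class, the enclosing class's method can still be reached by qualifying the invocation, so that it is no longer a simple method name. The following sketch is a variation on the example above with hypothetical class names and an added counter; it is not part of the specification.</p>

```java
// Hypothetical classes; the names are illustrative only.
class CombSuper {
    void f2(String s) {}
}

class CombOuter {
    int calls = 0;

    void f2(int i) {
        calls++;
    }

    int m() {
        new CombSuper() {
            {
                // An unqualified f2(0) would search CombSuper only and fail,
                // because CombSuper.f2(String) is not applicable.
                // Qualifying with CombOuter.this searches CombOuter instead.
                CombOuter.this.f2(0);
            }
        };
        return calls;
    }
}
```

<p>Calling <code>new CombOuter().m()</code> runs the instance initializer of the anonymous class once, so the counter is incremented exactly once.</p>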
</main><footer class="legal-footer"><hr/><a href="../legal/copyright.html">Copyright</a> &copy; 1993, 2021, Oracle and/or its affiliates, 500 Oracle Parkway, Redwood Shores, CA 94065 USA.<br>All rights reserved. Use is subject to <a href="https://www.oracle.com/java/javase/terms/license/java17speclicense.html">license terms</a> and the <a href="https://www.oracle.com/technetwork/java/redist-137594.html">documentation redistribution policy</a>. <!-- Version 17.0.2+8-LTS-86 --></footer>
</body>
</html>